SkyDist: Data Mining on Skyline Objects

Authors

  • Christian Böhm
  • Annahita Oswald
  • Claudia Plant
  • Michael Plavinski
  • Bianca Wackersreuther
Abstract

The skyline operator is a well-established database primitive that is traditionally applied so that only a single skyline is computed. In this paper, we use multiple skylines themselves as objects for data exploration and data mining. We define a novel similarity measure for comparing different skylines, called SkyDist. SkyDist can be used for complex analysis tasks such as clustering, classification, and outlier detection. We propose two different algorithms for computing SkyDist, one based on Monte-Carlo sampling and one on the plane sweep paradigm. In an extensive experimental evaluation, we demonstrate the efficiency and usefulness of SkyDist for a number of applications and data mining methods.
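The abstract does not spell out how SkyDist is defined, so the following is only a rough sketch of the Monte-Carlo idea under stated assumptions: data are normalized to the unit cube, smaller values are better in every dimension, and the distance between two skylines is taken to be the volume of the region dominated by exactly one of them. The function names (`dominates`, `skyline`, `covered`, `skydist_mc`) are hypothetical and not taken from the paper.

```python
import random


def dominates(p, q):
    """p dominates q: no worse in every dimension, strictly better in one.

    Assumption: smaller values are better in all dimensions.
    """
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def skyline(points):
    """Return the points that are not dominated by any other point (naive O(n^2))."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def covered(sky, x):
    """True if x lies in the region dominated by (or on) some skyline point."""
    return any(all(s_i <= x_i for s_i, x_i in zip(s, x)) for s in sky)


def skydist_mc(sky_a, sky_b, dim, samples=20000, seed=0):
    """Monte-Carlo estimate of the assumed distance between two skylines.

    Assumption: the distance is the volume, inside [0, 1]^dim, of the region
    covered by exactly one of the two skylines. Uniform samples are drawn and
    the fraction falling into that region is returned.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        x = [rng.random() for _ in range(dim)]
        if covered(sky_a, x) != covered(sky_b, x):
            hits += 1
    return hits / samples


if __name__ == "__main__":
    a = skyline([(0.2, 0.8), (0.5, 0.5), (0.6, 0.6), (0.8, 0.2)])
    b = skyline([(0.1, 0.9), (0.4, 0.4), (0.9, 0.1)])
    print("skyline a:", a)
    print("estimated distance:", skydist_mc(a, b, dim=2))
```

A pairwise matrix of such estimates could then be passed to any distance-based clustering or outlier-detection method. The sampling error shrinks roughly with the square root of `samples`, which is presumably why an exact plane-sweep alternative is also of interest.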

Similar resources

K-Dominant Skyline Computation by Using Sort-Filtering Method

Skyline queries are useful in many applications such as multicriteria decision making, data mining, and user preference queries. A skyline query returns a set of interesting data objects that are not dominated in all dimensions by any other objects. For a high-dimensional database, sometimes it returns too many data objects to analyze intensively. To reduce the number of returned objects and to...

Mining Thick Skylines over Large Databases

People recently are interested in a new operator, called skyline [3], which returns the objects that are not dominated by any other objects with regard to certain measures in a multi-dimensional space. Recent work on the skyline operator [3, 15, 8, 13, 2] focuses on efficient computation of skylines in large databases. However, such work gives users only thin skylines, i.e., single objects, whi...

Ranking uncertain sky: The probabilistic top-k skyline operator

Many recent applications involve processing and analyzing uncertain data. In this paper, we combine the feature of top-k objects with that of skyline to model the problem of top-k skyline objects against uncertain data. The problem of efficiently computing top-k skyline objects on large uncertain datasets is challenging in both computing the top-k skyline objects is developed for discrete cases...

On Dominating Your Neighborhood Profitably

Recent research on skyline queries has attracted much interest in the database and data mining community. Given a database, an object belongs to the skyline if it cannot be dominated with respect to the given attributes by any other database object. Current methods have only considered so-called min/max attributes like price and quality which a user wants to minimize or maximize. However, objec...

Probabilistic Skyline Queries over Uncertain Moving Objects

Data uncertainty inherently exists in a large number of applications due to factors such as limitations of measuring equipments, update delay, and network bandwidth. Recently, modeling and querying uncertain data have attracted considerable attention from the database community. However, how to perform advanced analysis on uncertain data remains an interesting question. In this paper, we focus ...

Publication date: 2010